A Semantically Motivated Approach to Compute ROUGE Scores
Authors
Abstract
ROUGE is one of the first and most widely used evaluation metrics for text summarization. However, its assessment relies solely on surface similarities between peer and model summaries. Consequently, ROUGE cannot fairly evaluate abstractive summaries that include lexical variation and paraphrasing. To explore how effectively lexical resource-based models address this issue, we incorporate a graph-based algorithm into ROUGE to capture the semantic similarities between peer and model summaries. Our semantically motivated approach computes ROUGE scores based on both lexical and semantic similarities. Experimental results on the TAC AESOP datasets indicate that exploiting the lexico-semantic similarity of the words used in summaries significantly helps ROUGE correlate better with human judgments.
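To make the idea of scoring on both lexical and semantic matches concrete, here is a minimal sketch of unigram (ROUGE-1) recall with an optional semantic fallback. It is not the graph-based algorithm from the paper, which the abstract does not specify: `TOY_SYNONYMS`, `toy_similarity`, `rouge1_recall`, and the `threshold` parameter are all hypothetical names introduced for illustration, and the hand-written synonym table merely stands in for a real lexical resource.

```python
from collections import Counter

# Hypothetical toy synonym groups standing in for a real lexical resource.
TOY_SYNONYMS = [{"car", "automobile"}, {"buy", "purchase"}]


def toy_similarity(a, b):
    """Return 1.0 for identical words or words sharing a toy synonym group, else 0.0."""
    if a == b:
        return 1.0
    return 1.0 if any(a in group and b in group for group in TOY_SYNONYMS) else 0.0


def rouge1_recall(peer, model, similarity=None, threshold=0.8):
    """Unigram recall of the peer summary against the model (reference) summary.

    With similarity=None this is plain clipped exact-match ROUGE-1 recall.
    With a similarity function, model tokens left unmatched after the exact
    pass may be matched to leftover peer tokens scoring >= threshold.
    """
    peer_counts = Counter(peer.lower().split())
    model_tokens = model.lower().split()
    if not model_tokens:
        return 0.0

    matched = 0
    unmatched = []
    # Pass 1: exact lexical matches, each peer token used at most once.
    for tok in model_tokens:
        if peer_counts[tok] > 0:
            peer_counts[tok] -= 1
            matched += 1
        else:
            unmatched.append(tok)

    # Pass 2 (assumption): greedy semantic matches against leftover peer tokens.
    if similarity is not None:
        for tok in unmatched:
            for cand, count in peer_counts.items():
                if count > 0 and similarity(tok, cand) >= threshold:
                    peer_counts[cand] -= 1
                    matched += 1
                    break

    return matched / len(model_tokens)


peer = "she will purchase the automobile"
model = "she will buy the car"
print(rouge1_recall(peer, model))                             # 0.6
print(rouge1_recall(peer, model, similarity=toy_similarity))  # 1.0
```

In this example the exact-match score penalizes the paraphrased peer summary for using "purchase" and "automobile", while the semantic pass credits both words. The greedy second pass is a simplification chosen for brevity; a real system would need a principled similarity measure and matching strategy.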
Similar resources
Topic-level Extractive Summarization of Lectures and Meetings Using a Snippet Similarity Graph
In this paper, we present an approach for topic-level video snippet-based extractive summarization, which relies on content-based recommendation techniques. We identify topic-level snippets using transcripts of all videos in the dataset and index these snippets globally in a word vector space. We generate a matrix of snippet cosine similarity scores, which is then used to compute top snippets t...
Probabilistic Document Modeling for Syntax Removal in Text Summarization
Statistical approaches to automatic text summarization based on term frequency continue to perform on par with more complex summarization methods. To compute useful frequency statistics, however, the semantically important words must be separated from the low-content function words. The standard approach of using an a priori stopword list tends to result in both undercoverage, where syntactical...
MeSH: a window into full text for document summarization
Motivation: Previous research in the biomedical text-mining domain has historically been limited to titles, abstracts and metadata available in MEDLINE records. Recent research initiatives such as TREC Genomics and BioCreAtIvE strongly point to the merits of moving beyond abstracts and into the realm of full texts. Full texts are, however, more expensive to process not only in terms of resources...
Multi-Candidate Reduction for Flexible Single-Document Summarization
Sentence compression techniques based on linguistically-motivated syntactic rules have proved effective in single-document summarization tasks. The addition of topic terms yields state-of-the-art performance, according to previous evaluations. Since “trimming” rules must be applied successively, optimal rule ordering presents a challenge. This paper describes the Multi-Candidate Reduction (MCR)...
Summarizing Student Responses to Reflection Prompts
We propose to automatically summarize student responses to reflection prompts and introduce a novel summarization algorithm that differs from traditional methods in several ways. First, since the linguistic units of student inputs range from single words to multiple sentences, our summaries are created from extracted phrases rather than from sentences. Second, the phrase summarization algorithm...
Journal title: CoRR
Volume: abs/1710.07441
Publication date: 2017